Skip to content

[BUG](exec) fix coalesce function output null#63092

Open
qzsee wants to merge 3 commits into
apache:masterfrom
qzsee:fix-coalesce
Open

[BUG](exec) fix coalesce function output null#63092
qzsee wants to merge 3 commits into
apache:masterfrom
qzsee:fix-coalesce

Conversation

@qzsee
Copy link
Copy Markdown
Contributor

@qzsee qzsee commented May 8, 2026

What problem does this PR solve?

Issue Number: close #xxx

Example: COALESCE(same_department_income_amount, 0) ==> outputs NULL (where same_department_income_amount is of type double).

When assigning the value to the result column in the computation, the assignment is done unconditionally (forced), as in:

result_raw_data[row] +=
                    column_raw_data[row] *
                    typename ColumnType::value_type(!(null_map_data[row] | filled_flag[row]));

If the argument column column_raw_data's null_map[row] is 1, then the value stored in column_raw_data[row] is garbage data. This garbage may contain values such as NaN. If a preceding argument of COALESCE happens to be assigned NaN, then during subsequent assignments we run into cases like:

0 * NaN = NaN
num + NaN = NaN

so the assigned result also becomes NaN, which causes value pollution.

By rights the final output should also be NaN, but what is actually returned is NULL. The reason is that during result serialization/output, NaN values are emitted as NULL.

tatus DataTypeNumberSerDe<T>::_write_column_to_mysql(const IColumn& column,
                                                      MysqlRowBuffer<is_binary_format>& result,
                                                      int row_idx, bool col_const,
                                                      const FormatOptions& options) const {
    //...
    else if constexpr (std::is_same_v<T, float>) {
        if (std::isnan(data[col_index])) {
            // Handle NaN for float, we should push null value
            buf_ret = result.push_null();
        } else {
            buf_ret = result.push_float(data[col_index]);
        }
    } 
  //...
}

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@HappenLee HappenLee changed the title [BUG] fix coalesce function output null [BUG](exec) fix coalesce function output null May 8, 2026
@zclllyybb
Copy link
Copy Markdown
Contributor

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation change is small and directly addresses the reported COALESCE value-pollution path, but this PR currently has no regression or unit test for the bug it fixes. Because this is BE expression correctness code, Doris review standards require a targeted test before merge.

Critical checkpoint conclusions:

  • Goal/test: The code appears to accomplish the stated goal for scalar non-string COALESCE results by avoiding arithmetic with null-position payloads, but no test proves the reported nullable DOUBLE/NaN case or prevents regression.
  • Scope: The code change is focused and minimal.
  • Concurrency/lifecycle: Not applicable; this is per-block expression evaluation with no new shared state or lifecycle management.
  • Configuration/compatibility/protocol: Not applicable; no config, persisted format, or FE-BE protocol changes.
  • Parallel paths: String/array/map/struct/JSON and bitmap paths already use guarded insertion/assignment; the changed scalar path is the relevant divergent path.
  • Conditions/error handling: No new error path; the new guard is the necessary data-correctness condition.
  • Tests/results: Missing targeted regression/unit coverage and no expected result update was included.
  • Observability/transactions/data writes: Not applicable.
  • Performance: The new branch is in a hot scalar loop but is justified by correctness; no obvious blocking performance issue found.

Focus points: No additional user-provided review focus was supplied.

Comment thread be/src/exprs/vcondition_expr.cpp Outdated
column_raw_data[row] *
typename ColumnType::value_type(!(null_map_data[row] | filled_flag[row]));
if (!(null_map_data[row] | filled_flag[row])) {
result_raw_data[row] = column_raw_data[row];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bug fix needs a targeted regression test. The PR description describes a concrete data-correctness failure where COALESCE(nullable_double, 0) can return NULL after a null-position payload contains NaN, but the PR only changes BE code and does not add a case under the existing conditional function/coalesce tests (or a BE unit test) that fails before this change and passes after it. Please add coverage for the nullable floating-point path so this does not regress.

Comment thread be/src/exprs/vcondition_expr.cpp Outdated
result_raw_data[row] +=
column_raw_data[row] *
typename ColumnType::value_type(!(null_map_data[row] | filled_flag[row]));
if (!(null_map_data[row] | filled_flag[row])) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add test please and also check the document

@HappenLee
Copy link
Copy Markdown
Contributor

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 29271 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1409eb15ef01acad148594f53309b505c1c3be7e, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17635	3800	3796	3796
q2	q3	10701	861	597	597
q4	4660	463	347	347
q5	7468	1335	1130	1130
q6	185	168	138	138
q7	915	955	743	743
q8	9292	1373	1297	1297
q9	5564	5361	5259	5259
q10	6294	2064	1832	1832
q11	490	261	247	247
q12	673	421	286	286
q13	18161	3675	2713	2713
q14	285	281	262	262
q15	q16	902	885	781	781
q17	997	987	731	731
q18	6455	5656	5487	5487
q19	1180	1183	1041	1041
q20	509	406	256	256
q21	4782	2302	2019	2019
q22	463	385	309	309
Total cold run time: 97611 ms
Total hot run time: 29271 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4720	4499	4716	4499
q2	q3	4716	4832	4296	4296
q4	2181	2200	1453	1453
q5	5018	5020	5242	5020
q6	192	168	137	137
q7	2099	1835	1600	1600
q8	3362	3095	3094	3094
q9	8444	8560	8464	8464
q10	4483	4533	4267	4267
q11	602	410	386	386
q12	701	743	511	511
q13	3220	3594	2912	2912
q14	313	299	277	277
q15	q16	756	768	682	682
q17	1335	1304	1259	1259
q18	7972	7079	7113	7079
q19	1150	1113	1129	1113
q20	2232	2242	1992	1992
q21	6070	5385	4898	4898
q22	559	477	394	394
Total cold run time: 60125 ms
Total hot run time: 54333 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 170437 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1409eb15ef01acad148594f53309b505c1c3be7e, data reload: false

query5	4316	665	514	514
query6	344	221	212	212
query7	4271	575	317	317
query8	345	232	215	215
query9	8833	4015	4005	4005
query10	450	346	297	297
query11	5862	2367	2214	2214
query12	188	141	132	132
query13	1277	622	419	419
query14	6821	5347	5058	5058
query14_1	4364	4356	4371	4356
query15	211	205	182	182
query16	1047	456	465	456
query17	1148	768	648	648
query18	2736	479	366	366
query19	230	210	185	185
query20	139	132	133	132
query21	221	142	119	119
query22	13560	13685	13371	13371
query23	17178	16455	16004	16004
query23_1	16175	16170	16258	16170
query24	7434	1765	1369	1369
query24_1	1341	1373	1363	1363
query25	646	524	476	476
query26	1316	332	175	175
query27	2655	576	336	336
query28	4405	1966	1956	1956
query29	1040	674	540	540
query30	305	235	207	207
query31	1124	1126	927	927
query32	90	76	70	70
query33	551	358	295	295
query34	1193	1105	645	645
query35	747	790	691	691
query36	1346	1371	1147	1147
query37	152	102	88	88
query38	3223	3131	3047	3047
query39	930	934	893	893
query39_1	884	905	866	866
query40	241	156	134	134
query41	69	63	62	62
query42	114	107	107	107
query43	331	326	285	285
query44	
query45	213	207	194	194
query46	1127	1166	728	728
query47	2353	2350	2218	2218
query48	407	414	308	308
query49	635	545	467	467
query50	699	287	215	215
query51	4298	4222	4184	4184
query52	105	104	97	97
query53	247	273	205	205
query54	314	270	251	251
query55	94	95	85	85
query56	300	299	307	299
query57	1440	1403	1339	1339
query58	304	279	276	276
query59	1549	1638	1449	1449
query60	349	344	335	335
query61	164	160	189	160
query62	674	614	563	563
query63	236	202	205	202
query64	2292	822	703	703
query65	
query66	1686	502	398	398
query67	29883	29831	29744	29744
query68	
query69	452	338	300	300
query70	1039	982	956	956
query71	315	273	273	273
query72	2954	2698	2517	2517
query73	866	779	438	438
query74	5052	4910	4729	4729
query75	2766	2660	2313	2313
query76	2311	1114	749	749
query77	416	430	349	349
query78	12967	12901	12410	12410
query79	1478	992	730	730
query80	1372	580	497	497
query81	525	280	239	239
query82	940	156	122	122
query83	354	270	250	250
query84	273	142	113	113
query85	907	506	475	475
query86	463	336	335	335
query87	3468	3345	3205	3205
query88	3493	2643	2614	2614
query89	442	386	346	346
query90	1910	179	176	176
query91	178	167	143	143
query92	82	79	78	78
query93	1143	952	563	563
query94	716	336	292	292
query95	656	476	353	353
query96	1045	758	353	353
query97	2708	2712	2544	2544
query98	242	234	235	234
query99	1104	1135	996	996
Total cold run time: 254750 ms
Total hot run time: 170437 ms

@dqz123
Copy link
Copy Markdown

dqz123 commented May 12, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 29699 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ac1d83dd1e4f3423d0170575b4a7cd906b1456f6, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17615	3874	3838	3838
q2	q3	10699	876	630	630
q4	4660	465	349	349
q5	7443	1352	1138	1138
q6	194	167	138	138
q7	905	936	753	753
q8	9410	1402	1278	1278
q9	5996	5359	5314	5314
q10	6311	2066	1799	1799
q11	473	264	253	253
q12	687	408	286	286
q13	18198	3285	2752	2752
q14	292	280	259	259
q15	q16	899	876	784	784
q17	1022	947	819	819
q18	6484	5725	5607	5607
q19	1446	1273	1104	1104
q20	517	395	257	257
q21	4684	2393	2024	2024
q22	453	390	317	317
Total cold run time: 98388 ms
Total hot run time: 29699 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4677	4527	4502	4502
q2	q3	4678	4779	4236	4236
q4	2137	2175	1401	1401
q5	5041	5045	5272	5045
q6	197	168	134	134
q7	2056	1790	1606	1606
q8	3362	3131	3108	3108
q9	8475	8523	8491	8491
q10	4449	4525	4266	4266
q11	614	438	415	415
q12	687	744	511	511
q13	3318	3586	2923	2923
q14	295	309	273	273
q15	q16	738	786	693	693
q17	1338	1277	1262	1262
q18	7944	7028	7060	7028
q19	1197	1148	1194	1148
q20	2245	2268	1988	1988
q21	6108	5425	4917	4917
q22	531	475	400	400
Total cold run time: 60087 ms
Total hot run time: 54347 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 170254 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ac1d83dd1e4f3423d0170575b4a7cd906b1456f6, data reload: false

query5	4317	639	528	528
query6	323	218	215	215
query7	4294	579	321	321
query8	331	228	214	214
query9	8821	4069	4040	4040
query10	457	340	292	292
query11	5799	2383	2260	2260
query12	184	132	132	132
query13	1302	603	441	441
query14	6723	5380	5058	5058
query14_1	4328	4328	4359	4328
query15	208	208	185	185
query16	1005	455	421	421
query17	1134	740	629	629
query18	2740	479	347	347
query19	216	204	161	161
query20	147	130	132	130
query21	215	142	118	118
query22	13569	13492	13313	13313
query23	17096	16414	15891	15891
query23_1	16045	16126	16067	16067
query24	7440	1777	1358	1358
query24_1	1374	1365	1372	1365
query25	597	539	485	485
query26	1170	321	173	173
query27	2716	602	347	347
query28	4435	1984	1967	1967
query29	1019	677	535	535
query30	309	246	208	208
query31	1118	1069	950	950
query32	92	80	74	74
query33	551	362	305	305
query34	1209	1118	648	648
query35	781	804	694	694
query36	1332	1370	1241	1241
query37	169	106	105	105
query38	3206	3123	3045	3045
query39	936	913	904	904
query39_1	880	886	876	876
query40	257	165	156	156
query41	72	69	68	68
query42	116	112	111	111
query43	325	332	298	298
query44	
query45	218	207	200	200
query46	1051	1226	726	726
query47	2354	2277	2196	2196
query48	405	412	308	308
query49	653	546	458	458
query50	718	284	223	223
query51	4349	4252	4229	4229
query52	109	108	97	97
query53	256	281	217	217
query54	326	288	282	282
query55	95	95	85	85
query56	312	333	326	326
query57	1423	1430	1336	1336
query58	329	291	281	281
query59	1553	1634	1425	1425
query60	352	356	342	342
query61	221	150	157	150
query62	667	653	558	558
query63	245	208	217	208
query64	2251	827	690	690
query65	
query66	1672	522	395	395
query67	30000	30017	29937	29937
query68	
query69	455	361	308	308
query70	1043	998	937	937
query71	320	275	266	266
query72	2938	2699	2425	2425
query73	833	742	437	437
query74	5069	4913	4786	4786
query75	2758	2679	2329	2329
query76	2321	1135	766	766
query77	434	420	356	356
query78	12910	12934	12283	12283
query79	1597	926	752	752
query80	1376	583	490	490
query81	518	286	234	234
query82	916	158	125	125
query83	340	278	250	250
query84	273	144	109	109
query85	898	549	449	449
query86	454	334	327	327
query87	3472	3360	3248	3248
query88	3526	2698	2619	2619
query89	459	382	335	335
query90	1903	182	178	178
query91	182	171	146	146
query92	83	78	79	78
query93	1232	970	548	548
query94	733	338	304	304
query95	687	380	350	350
query96	1099	784	333	333
query97	2715	2672	2581	2581
query98	244	225	241	225
query99	1112	1144	983	983
Total cold run time: 254810 ms
Total hot run time: 170254 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (6/6) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.62% (20650/38515)
Line Coverage 37.23% (195128/524084)
Region Coverage 33.60% (152467/453750)
Branch Coverage 34.61% (66501/192122)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (6/6) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.78% (27830/37718)
Line Coverage 57.63% (301250/522710)
Region Coverage 54.85% (251289/458169)
Branch Coverage 56.32% (108617/192850)

Copy link
Copy Markdown
Contributor

@HappenLee HappenLee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label May 12, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@zclllyybb
Copy link
Copy Markdown
Contributor

skip buildall

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.x dev/4.1.x reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants